Week 6: The Ground Game

The Ground Game

This week’s focus is voter mobilization and how turnout may affect which party does well in elections. Campaigns most often focus on turning out infrequent voters who align with their party’s or candidate’s platform. Methods include texting, calling, and knocking on potential voters’ doors to get out the vote. One big question I will investigate is whether there is a relationship between high turnout and the vote share of either major party.

I will build a predictive model of turnout for 2022, then check it by comparing its predictions for 2018 against actual 2018 turnout. After that, I will add turnout as a variable to my existing model from previous weeks and update my visualization and predictions.

Visualizing Voter Turnout v. Party Major Vote Percent in 2018

Higher Turnout in 2018 Helped Democrats

## 
## Call:
## lm(formula = DemVotesMajorPercent ~ turnout, data = dist_pv_cvap_closed)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -34.060 -12.411  -2.361  11.883  45.749 
## 
## Coefficients:
##             Estimate Std. Error t value            Pr(>|t|)    
## (Intercept)   46.623      5.607   8.314 0.00000000000000152 ***
## turnout       11.132     11.075   1.005               0.315    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 16.4 on 393 degrees of freedom
## Multiple R-squared:  0.002564,   Adjusted R-squared:  2.583e-05 
## F-statistic:  1.01 on 1 and 393 DF,  p-value: 0.3155

Using just the 2018 data, I plotted turnout against each party’s major vote share percentage. Higher turnout had a slightly negative correlation with Republican Major Vote Percent and a slightly positive correlation with Democratic Major Vote Percent. The relationship is weak, though: the turnout coefficient of 11.132 is not statistically significant (p = 0.315), and turnout alone explains well under 1% of the variation in Democratic vote share (R-squared of 0.0026). There are slightly more registered Democrats than Republicans in the US, but Democrats are largely less active voters than their Republican counterparts, so this correlation is interesting to see at play. I still think turnout may help my model’s fit this week.

Note that the 2018-only results seem to show a stronger correlation between voter turnout and Democratic major vote percent than the model in lab, which included more years.
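As a rough check on how much the fitted line actually moves with turnout, here is a minimal sketch that plugs the coefficients reported in the regression output into the line by hand. It assumes turnout is measured as a fraction between 0 and 1, which the output does not state, and the turnout values in `turnout_grid` are made up for illustration.

```r
# Coefficients copied from the 2018 regression output.
intercept <- 46.623
slope     <- 11.132
slope_se  <- 11.075

# Assumption: turnout is a fraction between 0 and 1 (illustrative values).
turnout_grid <- c(0.40, 0.50, 0.60)
predicted_dem_share <- intercept + slope * turnout_grid

# t statistic for the slope; an absolute value well below 2 means the slope
# is not distinguishable from zero at the 5% level.
t_value <- slope / slope_se

# Rough 95% confidence interval for the slope (normal approximation).
slope_ci <- slope + c(-1.96, 1.96) * slope_se

print(round(predicted_dem_share, 2))
print(round(t_value, 3))
print(round(slope_ci, 2))
```

The interval for the slope spans zero, which is another way of seeing that the 2018 data alone cannot pin down the direction of the turnout effect.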

Adding in Voter Turnout as a Predictor to Model

Observations        1308
Dependent variable  turnout
Type                OLS linear regression
F(53,1254)          51.97
R²                  0.69
Adj. R²             0.67

                                   Est.   S.E.  t val.      p
(Intercept)                      183.77   5.01   36.69   0.00
year                              -0.09   0.00  -36.70   0.00
state.xAlaska                      0.08   0.05    1.82   0.07
state.xArizona                    -0.04   0.02   -1.70   0.09
state.xArkansas                   -0.05   0.03   -1.99   0.05
state.xCalifornia                  0.01   0.02    0.57   0.57
state.xColorado                    0.09   0.02    4.00   0.00
state.xConnecticut                 0.08   0.02    3.26   0.00
state.xDelaware                   -0.13   0.05   -2.75   0.01
state.xFlorida                    -0.02   0.02   -0.84   0.40
state.xGeorgia                     0.03   0.02    1.53   0.13
state.xHawaii                     -0.03   0.03   -0.94   0.34
state.xIdaho                      -0.03   0.03   -0.87   0.39
state.xIllinois                    0.07   0.02    3.95   0.00
state.xIndiana                    -0.02   0.02   -1.16   0.25
state.xIowa                        0.09   0.03    3.30   0.00
state.xKansas                      0.04   0.03    1.32   0.19
state.xKentucky                    0.02   0.02    0.70   0.48
state.xLouisiana                  -0.13   0.02   -5.48   0.00
state.xMaine                       0.13   0.03    4.10   0.00
state.xMaryland                    0.07   0.02    3.35   0.00
state.xMassachusetts               0.02   0.02    1.16   0.25
state.xMichigan                    0.08   0.02    4.01   0.00
state.xMinnesota                   0.19   0.02    8.69   0.00
state.xMississippi                -0.07   0.03   -2.59   0.01
state.xMissouri                   -0.01   0.02   -0.26   0.79
state.xMontana                    -0.14   0.05   -2.85   0.00
state.xNebraska                    0.13   0.03    4.51   0.00
state.xNevada                     -0.00   0.03   -0.07   0.95
state.xNew Hampshire               0.11   0.03    3.28   0.00
state.xNew Jersey                  0.03   0.02    1.37   0.17
state.xNew Mexico                  0.04   0.03    1.34   0.18
state.xNew York                   -0.03   0.02   -1.40   0.16
state.xNorth Carolina              0.02   0.02    1.08   0.28
state.xNorth Dakota                0.04   0.05    0.90   0.37
state.xOhio                        0.03   0.02    1.30   0.19
state.xOklahoma                   -0.10   0.02   -3.87   0.00
state.xOregon                      0.04   0.03    1.65   0.10
state.xPennsylvania                0.03   0.02    1.73   0.08
state.xRhode Island                0.15   0.03    4.44   0.00
state.xSouth Carolina             -0.02   0.02   -0.88   0.38
state.xSouth Dakota               -0.02   0.05   -0.39   0.70
state.xTennessee                  -0.07   0.02   -3.23   0.00
state.xTexas                      -0.04   0.02   -2.03   0.04
state.xUtah                        0.02   0.03    0.89   0.37
state.xVermont                     0.09   0.05    2.02   0.04
state.xVirginia                    0.04   0.02    2.05   0.04
state.xWashington                  0.09   0.02    4.16   0.00
state.xWest Virginia              -0.01   0.03   -0.17   0.86
state.xWisconsin                   0.15   0.02    7.03   0.00
state.xWyoming                     0.10   0.05    2.14   0.03
president_partyR                   0.52   0.01   39.68   0.00
cvap                               0.00   0.00   15.55   0.00
winner_candidate_incIncumbent     -0.01   0.01   -2.16   0.03

Standard errors: OLS

This model uses previous midterm election data since 2012 to predict voter turnout for the 2022 election. I excluded presidential years because they skew the data toward higher turnout (Insert Citation), and 2022 is not a presidential year. The model has an R-squared of 0.69 and takes state, year, president_party, cvap, and incumbency into account. I will feed its predicted 2022 turnout into my model of the overall Democratic Major Vote Percent and the seat distribution for the 2022 midterm election.
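To make the structure of the turnout model concrete, here is a minimal sketch of fitting it and predicting 2022 turnout. It is not my actual pipeline: the data frame `toy`, its values, the three states, and the two hypothetical 2022 districts in `new_2022` are all made up, and the column names are assumed from the coefficient labels in the table.

```r
set.seed(42)

# Toy stand-in for the district-level midterm data (all values are made up).
n <- 300
toy <- data.frame(
  year  = rep(c(2010, 2014, 2018), each = n / 3),
  state = factor(sample(c("Alabama", "Florida", "Minnesota"), n, replace = TRUE)),
  cvap  = round(runif(n, 4e5, 6e5)),
  winner_candidate_inc = factor(sample(c("Challenger", "Incumbent"), n, replace = TRUE))
)
# Party of the sitting president in each midterm year.
toy$president_party <- factor(ifelse(toy$year == 2018, "R", "D"), levels = c("D", "R"))
# Simulated turnout as a fraction of CVAP.
toy$turnout <- 0.40 + 0.10 * (toy$state == "Minnesota") +
  0.08 * (toy$president_party == "R") + rnorm(n, sd = 0.03)

# Same set of predictors as the model summarized in the table.
turnout_fit <- lm(turnout ~ year + state + president_party + cvap +
                    winner_candidate_inc, data = toy)

# Hypothetical 2022 districts; factor levels must match the training data.
new_2022 <- data.frame(
  year  = 2022,
  state = factor(c("Florida", "Minnesota"), levels = levels(toy$state)),
  president_party = factor("D", levels = levels(toy$president_party)),
  cvap  = c(5e5, 5e5),
  winner_candidate_inc = factor("Incumbent", levels = levels(toy$winner_candidate_inc))
)
new_2022$turnout_hat <- predict(turnout_fit, newdata = new_2022)
print(new_2022[, c("state", "turnout_hat")])
```

One design point worth keeping in mind: predicting for 2022 extrapolates the year term beyond the training years, so the predicted turnout is only as trustworthy as the assumption that the year trend continues.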

Evaluating the Model’s Predicted Turnout vs. Actual Turnout in 2018

Here, I am plotting the difference between actual and predicted turnout in 2018 to test the accuracy of the predicted-turnout variable before I add 2022 turnout to my model. Red indicates that actual turnout was higher than predicted, so my model is under-predicting in the red areas. Blue indicates the opposite: where the map is more blue, such as in Florida, the model is having a hard time and is over-predicting turnout. I am curious to see which variables in certain states are causing this.
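The comparison behind the map can be sketched as a simple difference between actual and predicted turnout. This is only an illustration: the data frame `check_2018`, its district labels, and its numbers are made up, and the sign convention (actual minus predicted) is my assumption for the sketch.

```r
# Hypothetical 2018 values for three districts (made-up numbers).
check_2018 <- data.frame(
  district  = c("A", "B", "C"),
  actual    = c(0.55, 0.42, 0.50),
  predicted = c(0.50, 0.47, 0.50)
)

# Positive difference: actual turnout exceeded the prediction (under-predicted).
# Negative difference: actual turnout fell short of it (over-predicted).
check_2018$diff <- check_2018$actual - check_2018$predicted
check_2018$direction <- ifelse(check_2018$diff > 0, "under-predicted",
                        ifelse(check_2018$diff < 0, "over-predicted", "exact"))

# Root mean squared error as a single summary of prediction accuracy.
rmse <- sqrt(mean(check_2018$diff^2))

print(check_2018)
print(round(rmse, 4))
```

Writing the sign convention down next to the plot makes it easier to read the colors correctly, since flipping the subtraction flips which color means over-prediction.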

This Week’s Model Prediction

Predicting for Major Democratic Vote Percent

Observations        5934
Dependent variable  DemVotesMajorPercent.x
Type                OLS linear regression
F(5,5928)           3478.47
R²                  0.75
Adj. R²             0.75

                                   Est.   S.E.   t val.      p
(Intercept)                       95.93   1.22    78.80   0.00
Unemployed_prct                    0.92   0.22     4.22   0.00
winner_candidate_inc.xIncumbent    1.43   0.28     5.15   0.00
Receipts                          -0.00   0.00    -5.24   0.00
turnout                          -38.80   1.42   -27.39   0.00
avg                               -6.64   0.05  -126.89   0.00

Standard errors: OLS

Adding turnout increased my model’s R-squared slightly, from 0.74 to 0.75. Another important thing to note is how significant the avg variable is; it is the polling average rating from 1 (Solid Democrat) to 7 (Solid Republican) and has by far the largest t value (-126.89). Turnout enters with a negative coefficient (-38.80), unlike the small positive slope in the 2018-only regression. The least important variable in terms of significance is the unemployment rate (t value of 4.22), suggesting that the economy is less of an indicator than voter turnout, polling predictions, and money spent on ads.
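To see what these coefficients imply for a single district, here is a minimal sketch that applies them by hand. The district values are hypothetical, the helper `predict_dem_share` is written just for this illustration, and I am assuming unemployment is in percentage points and turnout is a fraction between 0 and 1. Receipts is left out because its coefficient is reported as -0.00 after rounding, so its true size is not recoverable from the table.

```r
# Coefficients copied from the summary table (Receipts omitted, see note above).
coefs <- c(intercept = 95.93, unemployed = 0.92, incumbent = 1.43,
           turnout = -38.80, avg = -6.64)

# Hypothetical helper: linear prediction from the rounded coefficients.
predict_dem_share <- function(unemployed_prct, incumbent, turnout, avg) {
  unname(coefs["intercept"] +
    coefs["unemployed"] * unemployed_prct +
    coefs["incumbent"] * incumbent +
    coefs["turnout"] * turnout +
    coefs["avg"] * avg)
}

# Made-up district: 4% unemployment, incumbent winner, 50% turnout.
rating_4 <- predict_dem_share(4, 1, 0.50, avg = 4)
rating_5 <- predict_dem_share(4, 1, 0.50, avg = 5)

print(round(rating_4, 2))             # 55.08
print(round(rating_5 - rating_4, 2))  # -6.64
```

Moving one step toward Solid Republican on the 1-to-7 rating lowers the predicted Democratic share by 6.64 points, which is why the rating dominates the other predictors.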

Visualizing 2022 Midterm Vote Prediction

Final Thoughts and Predictions

My final prediction for week 6 is the following:

Democratic Major Vote Percent: Democratic Seats:

Republican Major Vote Percent: Republican Seats:

Notes from This Week / Issues

I tried to improve my model by doing a few things:

  • Some values were lost when merging data sets, so I filled in NA’s in the polling ratings for districts by setting them to 3.1667, the “toss up” value.

  • Replaced any NA’s in the ad spend data with the mean of all observed values.

  • Added more years to my model to get a better picture and prediction, rather than using just one month of one year as in previous models.
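The two NA fixes in the list above can be sketched as follows. The data frame `districts`, its column names, and its values are hypothetical; only the fill value 3.1667 comes from my notes.

```r
# Hypothetical merged data with gaps (made-up values and column names).
districts <- data.frame(
  district = c("A", "B", "C", "D"),
  avg      = c(2, NA, 6, NA),      # polling rating on the 1-to-7 scale
  ad_spend = c(100, 300, NA, 200)
)

# Fill missing polling ratings with the "toss up" value.
districts$avg[is.na(districts$avg)] <- 3.1667

# Replace missing ad spend with the mean of the observed values.
mean_spend <- mean(districts$ad_spend, na.rm = TRUE)
districts$ad_spend[is.na(districts$ad_spend)] <- mean_spend

print(districts)
```

Mean imputation keeps every district in the model but shrinks the variance of the imputed column, so it is worth checking how many rows were filled before trusting the coefficient on ad spend.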